
Recently, the design and analysis of networks-on-chip (NoCs) have gathered significant momentum because of the
criticality of the communication substrate in designing scalable, high-performance, and energy-efficient multicore
systems. However, most NoCs have been designed in a monolithic manner without considering the actual
application requirements. We argue in this paper that such an approach is sub-optimal from both the
performance and energy standpoints and propose an application-driven approach to designing NoCs.
Based on the characterization of several applications, we observe that a heterogeneous NoC consisting
of two separate networks, one optimized for bandwidth and the other for latency, can cater to the
applications' requirements more effectively.

%Hence, we propose a novel packet classification scheme that uses the communication episode length and
%height to dynamically classify applications to steer them into one of the networks. Further,
%observing the property that not all applications are equally sensitive to latency or bandwidth, we
%devise a fine grain ranking of applications within the bandwidth and latency optimized sub-networks.
%Our rankings are empirically obtained using clustering algorithms and are very simple to implement in
%the network without requiring global co-ordination among any routers.

We evaluate the effectiveness of the proposed two-layer network against a range of monolithic designs.
Evaluations with 36 benchmarks on a 64-core 2D architecture indicate that the proposed two-layer
heterogeneous network, consisting of a 256b-link bandwidth-optimized network and a 64b-link
latency-optimized network, provides a 34\%/24\% (system/application) throughput improvement over a
128b-link network. Compared to an iso-resource (320b-link) single network, it is 5\%/3\% better in
weighted/instruction throughput while consuming 47\% less energy; compared to a very high bandwidth
512b-link network, it consumes 59\% less energy. In a combined performance-energy design space, the proposed
application-driven NoC outperforms all competing designs. In conclusion, while multiple on-chip
networks have been proposed in the literature, none of them is based on a systematic,
application-driven approach like ours. In addition, the proposed communication-episode-based classification
and ranking schemes are significantly better than state-of-the-art NoC prioritization mechanisms.

%We conclude that the proposed interconnection framework provides a promising way to host diverse
%applications on a CMP and build many-core NoCs that can provide high system and application
%performance at reduced energy/power consumption.

%Our proposal is within 2.5\%/4\% (system/application throughput) of an iso-resource
%320b-link network and within 2.5\% of a very high-bandwidth (512b) network.
